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Abstract  We  develop  a  method  for  non-paraxial  beam  propagation  that  obtains  a  speed 
improvement  over  the  Finite-Difference  Split-Step  method  (FDSSNP)  recently  reported  by 
Sharma  et  al.  The  method  works  in  the  eigen-basis  of  the  Laplace  operator  (v|.),  and  in 
general  requires  half  as  many  operations  to  propagate  one  step  forward  so  that  a  2X  speedup 
can  be  realized.  However,  the  new  formulation  allows  the  Fast  Fourier  Transform  (FFT) 
algorithm  to  be  used,  which  allows  an  even  greater  speedup.  The  method  does  not  require  a 
numerical  matrix  inversion,  diagonalization,  or  series  evaluation.  The  diffraction  operator  is 
not  approximated,  and  in  the  absence  of  refractive  index  fluctuations  the  method  reduces  to 
an  exact  solution  of  the  Helmholtz  equation. 
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1  Introduction 

Beam  propagation  methods  (BPM)  are  a  large  class  of  numerical  methods  for  solving  the 
scalar  Helmoltz  equation.  They  are  popular  for  simulating  guided  waves  and  laser  beams, 
as  they  are  typically  both  fast  and  efficient.  Early  methods  used  the  paraxial  approxima¬ 
tion,  which  greatly  simplified  the  problem  by  reducing  the  propagation  equation  to  first 
order,  but  they  were  severely  limited  in  their  application  because  any  beam  profile  containing 
spacial  frequencies  with  angles  greater  than  a  few  degrees  with  respect  to  the  propaga¬ 
tion  axis  incur  significant  phase  errors.  Many  methods  were  developed  to  drop  the  paraxial 
approximation  and  include  wide-angle  waves.  In  several  of  these,  the  Helmholtz  equation  is 
formally  rewritten  as  a  first-order  differential  equation  which  includes  the  square  root  of  an 
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operator.  Then  the  square  root  operator  is  either  approximated  using  real  or  complex  Pade 
approximants  and  a  Finite-Difference  or  iterative  method  is  used  to  solve  the  equation  (Hadley 
1992;  Le  et  al.  2008),  or  the  formal  solution  is  written  which  results  in  an  exponential  of  the 
square  root  of  an  operator  that  is  then  approximated  with  a  Pade  approximant  (Lu  and  Ho 
2002). 

Recently,  Sharma  extended  an  operator- splitting  technique  used  on  the  paraxial  wave 
equation  to  the  non-paraxial  wave  equation  (Helmholtz  equation)  Sharma  and  Agrawal 
(2004).  The  splitting  allows  diffraction  and  the  refractive  index  variations  to  be  handled 
separately,  and  various  numerical  methods  can  be  used  after  the  operator  has  been  split,  such 
as  collocation  or  Finite-Differencing  (Sharma  and  Agrawal  2004,  2006).  Here  we  develop  a 
method  that  represents  the  beam  profile  in  the  eigenbasis  of  the  diffraction  operator  and  uses 
the  operator- splitting  technique  to  account  for  the  refractive  index  variations.  In  general,  the 
method  provides  a  2X  speedup  to  the  Finite-Difference  method  reported  by  Sharma,  but  this 
can  be  increased  by  use  of  the  Fast  Fourier  Transform  algorithm. 


2  Formulation 

Beam  propagation  in  a  non-uniform  refractive  index  distribution  is  described  by  the  Scalar 
Wave  Equation, 

^2 

^2  9  (z> x)  +  Vr^  (x)  +  ko n2*  (z,  x)  +  *§  (n2  (z,  x)  -  n 2)  9  (z,  x)  =  0.  (1) 

Here,  (z,  x)  is  the  scalar  electric  field  given  in  two  or  three  dimensions  and  Vf  is  the  one  or 
two  dimensional  transverse  part  of  the  Laplacian.  As  usual,  ko  denotes  the  free-space  wave- 
number  of  the  beam,  n 2  (z,  x)  is  the  refractive  index  of  the  medium  and  h2  is  a  background 
refractive  index,  usually  taken  to  be  min  { n 2  (z,  x)}.  The  term  ( n 2  (z,  x)  —  h2)  can  be  viewed 
as  a  perturbation  to  the  homogeneous  propagation  in  a  media  with  refractive  index  n.  Con¬ 
sider  the  set  of  eigenfunctions  {<pn  (x))  and  eigenvalues  (— X2)  of  the  transverse  Laplacian 
satisfying  the  waveguide  boundary  condition.  They  are  defined  as 

Vj(pn  (x)  =  -A2</>„  (x)  , 

J  cp*  (x)  <t>m  (x)  dx  =  <$„,m ,  (2) 

<Pn  (X)  =  0, 

where  X  is  the  coordinate  of  the  waveguide  wall.  The  eigenfunctions  form  a  complete  basis, 
so  we  expand  the  field  profile 


oo 

^(z,x)  =  ^a„  (z)4>n  (x).  (3) 

n= 1 

Inserting  3  into  Eq.  1 ,  then  multiplying  by  </>^  (x)  and  integrating  gives  a  system  of  differential 
equations, 


(Z) 


OO  „ 

-  (k\h2  -  A.2  )  am  (z)  -  X  / 

n=  1  ‘ 


(»2  (z. 


s)  -  «2)  0rn  (x)  0«  (x)  dx  an  (z) . 


(4) 
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The  eigenfunction  profiles,  4>n,  are  propagating  modes,  and  Eq.  4  is  the  set  of  equations  that 
govern  the  mode  amplitudes.  If  the  refractive  index  were  constant,  i.e.  n2  (z,  x)  —  h2  =  0, 
then  the  modes  would  each  propagate  independent  of  the  others.  However,  the  last  term  of 
Eq.  4  couples  these  modes,  allowing  energy  to  transfer  between  them.  Note  also  that  this  term 
is  just  the  refractive  index  operator  (which  is  diagonal  in  the  spatial  coordinates)  represented 
in  the  basis  of  eigenfunctions,  and  can  be  written  in  operator  form  as  §N  (z)  §-1a. 

We  now  proceed  to  implement  the  split-step  method  described  by  Sharma  and  Agrawal 
(2004,  2006),  Sharma  et  al.  (2007).  Equation  4  can  be  written  in  matrix  form, 


bm  —  dzClm , 


dzA  =  H  (z)  A, 


H  (z)  = 


"0 

.  -  (*0«2 


I 

X2m)  I  -  SN  (z)  S-1  0 


(5) 


which  has  the  solution  A  (zo  +  A z)  =  em(^Az  A  (zo)-  The  operator  H  (z)  can  be  written  as 
the  sum  of  two  operators, 


H  (z)  =  Hi  +  H2  (z)  , 

Hl  -  [ -  (^o«2  -  xi) 11  °]’ 

e2  (z)  =  [  _gN  (z)  g— 1  0j. 

so  that 

A  (zo  +  Az)  =  e(M2+M2(z))AzA  (z0) .  (9) 

The  matrix  exponential  cannot  simply  be  split  into  the  product  of  two  matrix  exponentials 
because  the  matrices  Hi  and  H2  (z)  do  not  in  general  commute.  Instead,  the  exponential  is 
approximated  using  a  symmetrized  operator  splitting  technique  (Sharma  and  Agrawal  2004, 
2006;  Dattoli  et  al.  1998), 

^EI(z)Az  ^  e\Wi^ze^2{z)^ze\^i^z 


(6) 

(7) 

(8) 


which  is  second-order  accurate  in  Az  (Sharma  and  Agrawal  2004,  2006;  Dattoli  et  al.  1998; 
Fleck  et  al.  1976).  It  is  clear  that  e  2 Hl  Az  propagates  the  mode  a  half  step  forward  in  a  uniform 
media,  so  £M2(z)Az  must  account  for  the  inhomogeneous  refractive  index. 

The  next  step  is  to  compute  the  two  matrix  exponentials.  Expanding  the  exponential  in  its 
power  series  form  and  evaluating  the  sum  gives  (Sharma  and  Agrawal  2006;  Sharma  et  al. 
2007), 


eM2(z)Az  _ 


_  0 

-  [— SN(z)S“1Az  I 

gjIHIlA z  _  p 


C0S  (fit,  mIAz/2^  sin  ^Jk^mIAz/2^ 

—■sJk'2,m  I  sin  (yi^iAz/2)  cos  XfiU^/2) 


(11) 

(12) 

(13) 


where  we  have  used  k2m  =  (k^n2  —  A^).  The  difference  between  this  and  the  Finite- 
Difference  formulation  is  that  the  operators  under  the  square  root  are  already  diagonal, 
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Table  1  Application  to  various 
coordinate  systems 


2D,  Cartesian  coordinates 


(3^  =  0) 


WT  ^  3x2 
§  — ►  Sine  Transform 

4>  (x)  ->►  sin  (knx) 


-ki 


b  ,  >UL 
Kn  X 

2D,  Cylindrical  coordinates  (d^  &  =  0^ 

\/2  v  199 
WT  ^  r  drrdr 

§  — ►  Hankel  Transform 
0  (x)  ->  J0  ( pnr ) 

-A2  ->  -pi 
Pn^lt 

3D,  Cartesian  coordinates 


Vz  ^ 

T  dx2  3 y2 

§  — y  2D  Sine  Transform 

0  (x)  -*  sin  (knx)  sin  (kmy) 


b  ,  nn 
Kn  X 
mn 
Y 


-  (k2  +  k2^ 


k  .  m7T 
“ v- 


so  they  can  be  directly  evaluated.  These  two  matrices  can  be  used  to  propagate  an  initial  A 
forward  one  step  in  the  z  direction.  The  complete  propagation  algorithm  is  then  written  as 

A(zo  +  n  Az)  =  PQ  (zo  +  (n  -  1)  Az)  P  •  •  •  PQ  (zo  +  Az)  PPQ  (zo)  PA  (zo)  • 

However,  we  note  that  since  [Hi,  Hi]  =  0, 

e^MiAzejMiAz  _  eMiAz 


So  far,  the  derivation  of  the  propagation  algorithm  has  not  depended  on  the  eigenfunctions 
0n,  only  their  properties  (Eq.  2).  This  is  convenient  because  the  method  is  independent  of 
the  specific  coordinate  system  used  (only  that  it  is  rectilinear  in  z).  Choosing  a  coordinate 
system  then  sets  the  eigenfunctions  which  defines  S  and  the  eigenvalues,  —X2  (See  Table  1). 
In  principle,  any  coordinate  system  could  be  used,  as  long  as  a  numerical  representation  of 
the  associated  eigenfunction  basis  transform  is  available. 


3  Numerical  example 

To  test  the  method,  and  compare  it  to  Sharma’s  FDSSNP  method,  we  have  simulated  prop¬ 
agation  through  the  2-dimensional  (Cartesian)  symmetric  Epstein-layer  waveguide,  whose 
refractive  index  profile  is  given  as 
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Table  2  Test  case  configuration  .  _9  .  9  .  .  ..  .  . 

nl  (x,  z)  =  nz  +  2nAnsectr  (2  (x  cos#  —  z  sin#)  /w) 

h  =  2.1455 

An  =  0.003 

w  =  5  /i  m 

A.Q  =  1.3  jum 

#  =  0°,  50° 


n2  (x)  =  h2  +  2/zA/zsech2  (2x/w) .  (14) 

Table  2  lists  the  specific  values  used  in  the  configuration.  The  zero’ th  order  mode  is 
&0  (x,0)  =  sechw  (2 x/w) , 

V0(x,z)  =  Vo(x,0)eikzOZ, 


2nAn  —  1 


*— 

If  the  waveguide  is  rotated  such  that  it  makes  an  angle  6  with  the  z-axis,  the  refractive  index 
and  field  become 

n 2  (x,  z)  =  n2  +  2h  A/zsech2  (2  (x  cos  0  —  z  sin  0)  /w) , 
tf'o  (A',0)  =  sechw  (2  (x  cos  0)  /io)  sinS, 

%  (x,z)  =  Vo  (x,  0)  eikzOZCOS0. 


We  use  the  correlation  factor  to  measure  error, 

CF(z)  =  l/^o*  (j:,z)^o(x,z)^|2 

l/^o*  dx |2 

=  1  -  CF(z), 

where  is  the  exact  beam  profile  and  t/f  is  the  computed  profile,  as  it  provides  a  measure 
for  both  the  profile  shape  and  amplitude  (Sharma  and  Agrawal  2006). 

We  have  implemented  C++  versions  of  both  the  FDSSNP  and  our  method.  The  programs 
were  compiled  with  g++,  version  4.4.3,  and  all  simulations  where  ran  on  an  Intel  Core  2  Duo, 
2.66  GHz  processor  with  8  GB  of  RAM. 

The  eigenfunctions  of  the  transverse  Laplace  are  harmonic  sine  functions.  The  transform, 
S,  is  a  Sine  Transform,  and  can  be  applied  with  an  FFT  library.  We  used  the  FFTW  library, 
version  3.2.2.  Figures  1  and  2  show  the  error  with  propagation  distance  for  an  aligned  and 
rotated  waveguide  for  various  values  of  Az  and  demonstrate  the  algorithm’s  stability. 

The  sub-matrices  of  PP  in  the  Finite-Difference  representation  are,  in  general,  dense.  To 
reduce  the  number  of  operations  needing  during  the  propagation  steps,  the  sub-matrices  are 
precomputed.  We  have  used  the  BLAS  library  to  apply  PP,  which  is  known  to  be  very  efficient 
and  fast.  Our  Spectral  method  is  shown  to  be  about  twice  as  fast  for  the  smallest  transverse 
grid  we  tested,  300  points,  but  over  10  times  faster  for  the  large  grid,  1200  points  (Fig.  3). 
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waveguide  rotation:  0  degrees 


Fig.  1  Error  as  a  function  of  propagation  distance  for  aligned  waveguide 


Fig.  2  Error  as  a  function  of  propagation  distance  for  waveguide  tilted  at  50  degrees 


4  Discussion 


4.1  Speedup 


The  reason  for  the  observed  speedup  of  our  Spectral  method  follows  directly  from  the  form 
of  P  and  Q.  In  the  Finite-Difference  method,  P  and  Q  are  represented  as  (Sharma  et  al.  2007), 


Q  [  — N  (z)  Az  I  ]  ’ 


V  [cos  (ViKAz/2)]v 

V 

sin  (VlKAz/2) /Vk]  V 

-V  [ VI  sin  (VlKAz/2) 

]v  V 

cos  (ViKAz/2)]  V 

(15) 

(16) 


where  K  is  the  diagonal  matrix  constructed  using  the  series  representation  of  the  Finite- 
Difference  operator,  and  ¥  is  the  matrix  that  diagonalizes  the  Central-Difference  operator. 
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Fig.  3  A  comparison  of  the  time  required  to  propagate  200  steps  for  the  Finite-Difference  method  and  our 
Spectral  method  for  various  transverse  grid  resolutions 


As  more  terms  are  used  to  construct  IK,  P  becomes  increasingly  dense,  and  so  in  general,  P 
is  composed  of  four  Nt  x  Nt  matrices.1  The  matrix  Q  is  composed  of  three  diagonal  matri¬ 
ces,  but  only  one  needs  to  be  explicitly  applied.  Application  of  PQ  (or  PPQ)  requires  one 
vector-vector  element-wise  product  and  four  matrix-vector  multiplications  with  a  complex 
input  vector,  so  a  total  of  2  (ANj  +  Nt)  multiplications  are  required. 

The  form  of  P  and  Q  derived  here  only  require  2  (5  Nj  +  2Nj)  multiplications  to  be 
applied.  P  is  composed  of  four  diagonal  matrices,  which  requires  four  vector- vector  element¬ 
wise  products.  Q  only  contains  one  matrix  that  must  be  applied,  but  it  is  applied  using  one 
matrix- vector  multiplication  followed  by  a  vector- vector  element-wise  product  and  another 
matrix-vector  multiplication.  So  in  general,  the  Finite-Difference  method  requires 
times,  or  roughly  2  times,  as  many  multiplications  as  our  Spectral  method.  However,  as 
noted  above,  in  Cartesian  coordinates,  the  transformation  matrix  S  can  be  applied  using  an 
FFT  library,  which  can  apply  S  in  about  Nt  log  Nt  multiplications.  For  such  cases,  the  Finite- 
Difference  method  will  require  2k>gyvr+5  more  multiplications,  which  leads  to  significant 
speed  improvements  as  Nt  gets  large. 

4.2  Other  coordinate  systems 

Both  the  Finite-Difference  and  Spectral  formulations  can  be  be  applied  to  multiple  coordinate 
systems.  The  Finite-Difference  approach  requires  a  Finite-Difference  representation  of 
that  must  then  be  diagonalized  numerically  or  analytically,  whereas  the  Spectral  approach 
requires  numerical  representations  of  the  eigenfunction  transforms  associated  with  .  Shar- 
ma  has  recently  shown  that  the  Finite-Difference  approximation  of  in  2D  Cartesian  coor¬ 
dinates  (Nj  ->  d%)  (Sharma  et  al.  2007)  and  3D  Cartesian  coordinates  (V^  ->  +  3^) 

(Bhattacharya  and  Sharma  2007)  can  be  diagonalized  analytically.  Nash  has  shown  simi¬ 
larly  that  the  Finite-Difference  approximation  of  in  2D  Cylindrical  coordinates  (V^^ 
r~l  drr  3r)  (Nash  and Lopez-Mobilia  2004)  can  also  be  diagonalized  analytically,  which  could 
be  used  to  formulate  a  Finite-Difference  based  split- step  propagation  method  in  systems 
possessing  azimuthal  symmetry. 


1  Here,  Nt  denotes  the  number  of  transverse  grid  points. 


<£)  Springer 


C.  D.  Clark  III,  R.  J.  Thomas 


As  outlined  in  Table  1,  in  the  Spectral  method  formulation,  these  coordinate  systems 
require  ID  and  2D  Fourier  Transforms  which  can  be  computed  using  FFT  libraries  (Press 
et  al.  2002)  and  a  Hankel  Transform,  which  could  be  computed  using  the  Quasi-Discrete 
Hankel  Transform  (Yu  et  al.  1998;  Guizar-Sicairos  and  Gutie  2004)  for  example. 

4.3  Boundary  conditions 

We  have  assumed  the  boundary  condition  4>n  (X)  =  0  for  each  of  the  eigenfunctions,  which 
imposes  the  “hard  wall”  boundary  condition  on  the  field;  the  computational  domain  is  sur¬ 
rounded  by  a  perfectly  reflecting  metal  surface.  This  can  be  a  serious  problem,  especially 
when  simulating  photonic  devices,  because  it  prevents  energy  loss  through  the  radiation 
modes  as  any  wave  reaching  the  boundary  will  be  reflected  back  into  the  waveguide.  How¬ 
ever,  for  applications  in  beam  propagation,  this  is  typically  not  an  issue  because  the  beam 
profile  will  have  a  finite  size  and  the  hard  wall  boundary  can  be  placed  far  enough  away  such 
that  it  does  not  interfere  with  the  beam. 

It  would  also  be  possible  to  implement  non-reflecting  boundary  conditions,  such  as  the 
Perfectly  Matched  Layer  (PML),  although  our  applications  have  so  far  not  required  this. 
Multiple  non-reflecting  boundary  conditions  have  been  developed  for  the  Eigen  Mode  Expan¬ 
sion  method  (EME)  (Bienstman  and  Baets  2002),  and  the  same  methods  could  be  applied 
here.  For  example,  a  PML  can  derived  by  simply  considering  the  layer  to  have  a  complex 
thickness  (Bienstman  and  Baets  2001;  Derudder  et  al.  2001).  To  add  a  PML,  a  layer  with  an 
imaginary  component  to  its  space  coordinate  would  be  placed  before  the  hard  wall,  to  absorb 
all  incident  waves.  The  general  eigenfunctions  (and  eigenvalues)  of  would  then  need 
to  be  found  for  this  geometry  to  construct  the  basis  transformation,  S,  and  the  propagation 
algorithm  would  remain  unchanged. 

However,  the  change  in  basis  functions  to  include  a  PML  would  mean  that  the  FFT 
algorithm  could  not  be  used,  and  only  a  2X  speedup  over  the  FDSSNP  method  could  be 
achieved.  The  use  of  a  non-reflective  boundary  condition  would  most  likely  only  be  benefi¬ 
cial  when  boundary  interference  cannot  be  avoided  by  placing  the  metal  surface  farther  away, 
or  the  coordinate  system  does  not  allow  use  of  the  FFT  (such  as  cylindrical  coordinates).  For 
example,  transforming  a  vector  of  400  points  using  a  full  matrix-vector  multiplication  costs 
approximately  the  same  computational  time  as  transforming  a  vector  of  35,000  points  using 
the  FFT.  So  the  computational  domain  could  be  made  over  80  times  larger  using  the  FFT, 
before  the  matrix-vector  multiplication  became  more  efficient. 


5  Conclusions 

We  have  outlined  a  method  here  that  improves  the  performance  of  the  wide-angle  split- step 
beam  propagation  method  recently  developed  by  Sharma.  The  method  recasts  the  propaga¬ 
tion  equation  into  the  Spectral  domain  of  the  transverse  Laplacian  operator  and  depends  on 
numerical  representations  of  special  function  transforms,  rather  than  Finite-Difference  rep¬ 
resentations  of  the  transverse  Laplacian.  The  method  requires  roughly  half  as  many  floating 
point  multiplications  as  the  Finite-Difference  method  in  general  because  of  the  particular 
operator  splitting  used  to  approximate  the  operator  exponential,  the  FFT  can  be  used  to 
reduce  this  further.  We  have  demonstrated  the  algorithm  in  2D  Cartesian  coordinates,  but 
the  method  derived  here  is  equally  applicable  to  2  and  3D  beam  propagation  in  Cartesian  or 
Cylindrical  coordinates,  and  the  numerical  transforms  needed  for  both  cases  are  available  in 
the  literature. 
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